Game-theoretical control with continuous action sets
Motivated by the recent applications of game-theoretical learning techniques
to the design of distributed control systems, we study a class of control
problems that can be formulated as potential games with continuous action sets,
and we propose an actor-critic reinforcement learning algorithm that provably
converges to equilibrium in this class of problems. The method employed is to
analyse the learning process under study through a mean-field dynamical system
that evolves in an infinite-dimensional function space (the space of
probability distributions over the players' continuous controls). To do so, we
extend the theory of finite-dimensional two-timescale stochastic approximation
to an infinite-dimensional, Banach space setting, and we prove that the
continuous dynamics of the process converge to equilibrium in the case of
potential games. These results combine to give a provably-convergent learning
algorithm in which players do not need to keep track of the controls selected
by the other agents.
Comment: 19 pages
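As a concrete (and heavily simplified) illustration of the scheme described in this abstract: the sketch below replaces the measure-valued mixed strategies with a parametric Gaussian per player, in a hypothetical identical-interest (hence potential) game on [0, 1]. The potential function, step-size schedules, and noise width are all assumptions for the sketch, not the paper's construction; what it does share with the abstract is the two-timescale structure (fast critic, slow actor) and the fact that each player uses only its own sampled action and observed payoff.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-player potential game on [0, 1]: both players receive the
# potential itself as payoff (an identical-interest game is the simplest
# potential game). Phi is strictly concave with maximiser (0.6, 0.4).
def potential(a):
    return -(a[0] - 0.7) ** 2 - (a[1] - 0.3) ** 2 - 0.5 * (a[0] - a[1]) ** 2

m = np.array([0.2, 0.9])   # actor: mean of each player's Gaussian mixed strategy
b = potential(m)           # critic: running baseline of observed payoffs
sigma = 0.1                # fixed exploration width (a simplification; the paper
                           # adapts full mixed strategies, not a Gaussian mean)

for n in range(1, 20001):
    a = m + sigma * rng.standard_normal(2)   # sample continuous actions
    u = potential(a)                         # observed common payoff
    alpha = 0.5 / (n + 50) ** 0.6            # fast (critic) step size
    beta = 0.5 / (n + 50) ** 0.9             # slow (actor) step size
    b += alpha * (u - b)                     # critic tracks the average payoff
    # Score-function (likelihood-ratio) update: each player nudges its mean
    # along a stochastic gradient of its own expected payoff.
    m += beta * (u - b) * (a - m) / sigma ** 2
    m = np.clip(m, 0.0, 1.0)                 # stay inside the action set
```

In this toy game the potential maximiser (0.6, 0.4) is the Nash equilibrium, and the strategy means drift toward it without either player observing the other's action.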
A phase transition for measure-valued SIR epidemic processes
We consider measure-valued processes $X=(X_t)$ that solve the following
martingale problem: for a given initial measure $X_0$, and for all smooth,
compactly supported test functions $\varphi$, \begin{eqnarray*}X_t(\varphi
)=X_0(\varphi)+\frac{1}{2}\int _0^tX_s(\Delta \varphi )\,ds+\theta
\int_0^tX_s(\varphi )\,ds\\{}-\int_0^tX_s(L_s\varphi )\,ds+M_t(\varphi
).\end{eqnarray*} Here $L_t$ is the local time density process associated
with $X$, and $M_t(\varphi)$ is a martingale with quadratic variation
$\int_0^tX_s(\varphi^2)\,ds$. Such processes arise as scaling
limits of SIR epidemic models. We show that there exist critical values
$\theta_c(d)\in(0,\infty)$ for dimensions $d=2,3$ such that if
$\theta>\theta_c(d)$, then the solution survives forever with positive
probability, but if $\theta<\theta_c(d)$, then the solution dies out in finite
time with probability 1. For $d=1$ we prove that the solution dies out almost
surely for all values of $\theta$. We also show that in dimensions $d=2,3$ the
process dies out locally almost surely for any value of $\theta$; that is, for
any compact set $K$, the process eventually vanishes on $K$.
Comment: Published at http://dx.doi.org/10.1214/13-AOP846 in the Annals of
Probability (http://www.imstat.org/aop/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Supersymmetric Ward Identities and NMHV Amplitudes involving Gluinos
We show how Supersymmetric Ward identities can be used to obtain amplitudes
involving gluinos or adjoint scalars from purely gluonic amplitudes. We obtain
results for all one-loop six-point NMHV amplitudes in $\mathcal{N}=4$ super
Yang-Mills theory which involve two gluinos or two scalar particles. More
general cases are also discussed.
Comment: 32 pages, minor typos fixed; one reference added
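The mechanics can be sketched in a line. In one common spinor-helicity convention (signs and overall factors are convention dependent, so this is schematic), a supersymmetric Ward identity reduces an MHV amplitude with a gluino pair to the Parke-Taylor gluon amplitude, \begin{eqnarray*}A_n\big(\Lambda_1^-,g_2^-,g_3^+,\dots ,\Lambda_k^+,\dots ,g_n^+\big)=\frac{\langle 2\,k\rangle}{\langle 2\,1\rangle}\,A_n\big(g_1^-,g_2^-,g_3^+,\dots ,g_n^+\big),\end{eqnarray*} and identities of this type, applied at one loop, are what fix the six-point NMHV gluino and scalar amplitudes in terms of purely gluonic ones.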
Staff Perceptions of Standards-Based Grading Prior To Implementation
The purpose of this qualitative study was to evaluate the perceptions of a group of middle school teachers regarding changing to standards-based grading (SBG). Data were collected from the transcripts of two focus groups and analyzed. Results indicated that SBG measures were not well known by all staff, and several clear resistance points were present, centered on five key themes: fear of loss of rigor, community pushback, lack of knowledge of SBG practices, lack of supporting infrastructure, and the extra time and work required. The recommendations that flow from these results are that, prior to implementing SBG, comprehensive data be collected on staff beliefs about grading and reporting in general, and that targeted, differentiated professional development be planned for staff based on those data. Continuing to expand SBG practices within schools is the ultimate goal, given the large body of research espousing its benefits.
Asynchronous Stochastic Approximation with Differential Inclusions
The asymptotic pseudo-trajectory approach to stochastic approximation of
Benaïm, Hofbauer and Sorin is extended for asynchronous stochastic
approximations with a set-valued mean field. The asynchronicity of the process
is incorporated into the mean field to produce convergence results which remain
similar to those of an equivalent synchronous process. In addition, this allows
many of the restrictive assumptions previously associated with asynchronous
stochastic approximation to be removed. The framework is extended for a coupled
asynchronous stochastic approximation process with set-valued mean fields.
Two-timescale arguments are used here in a manner similar to the original work
in this area by Borkar. The applicability of this approach is demonstrated
through learning in a Markov decision process.
Comment: 41 pages
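A canonical concrete instance of asynchronous stochastic approximation is tabular Q-learning, in which only the currently visited state-action entry is updated at each step. The sketch below uses a hypothetical two-state MDP (not taken from the paper); because its rewards and transitions are deterministic, a constant step size suffices for the sketch, whereas the asynchronous setting analysed in the paper requires decaying step sizes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical deterministic 2-state MDP: in either state, action 0 ("stay")
# pays 0 and keeps the state; action 1 ("move") pays 1 and switches state.
# With discount gamma = 0.9 the optimal policy always moves, so
# Q*(s, move) = 1/(1 - gamma) = 10 and Q*(s, stay) = gamma * 10 = 9.
gamma = 0.9
Q = np.zeros((2, 2))
s = 0
for _ in range(20000):
    a = int(rng.integers(2))             # random exploratory action
    r = float(a)                         # reward 1 only for "move"
    s_next = 1 - s if a == 1 else s
    # Asynchronous update: only the visited (s, a) entry moves this step.
    Q[s, a] += 0.5 * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next
```

Despite the updates arriving asynchronously along a single trajectory, every entry is visited infinitely often under the random exploration, and the table approaches Q* with stay-values 9 and move-values 10 in both states.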
Best-response Dynamics in Zero-sum Stochastic Games
We define and analyse three learning dynamics for two-player zero-sum discounted-payoff stochastic games. A continuous-time best-response dynamic in mixed strategies is proved to converge to the set of Nash equilibrium stationary strategies. Extending this, we introduce a fictitious-play-like process in a continuous-time embedding of a stochastic zero-sum game, which is again shown to converge to the set of Nash equilibrium strategies. Finally, we present a modified δ-converging best-response dynamic, in which the discount rate converges to 1, and the learned value converges to the asymptotic value of the zero-sum stochastic game. The critical feature of all the dynamic processes is a separation of adaptation rates: beliefs about the value of states adapt more slowly than the strategies adapt, and in the case of the δ-converging dynamic the discount rate adapts more slowly than everything else.
Mixed-strategy learning with continuous action sets
Motivated by the recent applications of game-theoretical learning to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets. We propose an actor-critic reinforcement learning algorithm that adapts mixed strategies over continuous action spaces. To analyse the algorithm we extend the theory of finite-dimensional two-timescale stochastic approximation to a Banach space setting, and prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a provably-convergent learning algorithm in which players do not need to keep track of the controls selected by other agents.